    On the design of robust classifiers for computer vision

    The design of robust classifiers, which can contend with the noisy and outlier-ridden datasets typical of computer vision, is studied. It is argued that such robustness requires loss functions that penalize both large positive and negative margins. The probability elicitation view of classifier design is adopted, and a set of necessary conditions for the design of such losses is identified. These conditions are used to derive a novel robust Bayes-consistent loss, denoted the Tangent loss, and an associated boosting algorithm, denoted TangentBoost. Experiments with data from the computer vision problems of scene classification, object tracking, and multiple instance learning show that TangentBoost consistently outperforms previous boosting algorithms.
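    To make the robustness argument concrete, the following is a minimal Python sketch assuming the commonly cited form of the Tangent loss, phi(v) = (2*arctan(v) - 1)^2, as a function of the margin v = y*f(x); the exact conventions in the paper may differ.

```python
import numpy as np

def tangent_loss(v):
    # Commonly cited form of the Tangent loss, phi(v) = (2*arctan(v) - 1)^2,
    # as a function of the margin v = y*f(x).  It is bounded, and it grows again
    # for very large positive margins, so both kinds of over-confident examples
    # (badly misclassified outliers and far-from-the-boundary points) are penalized.
    return (2.0 * np.arctan(v) - 1.0) ** 2

def exponential_loss(v):
    # The AdaBoost loss, for comparison: unbounded for large negative margins.
    return np.exp(-v)

if __name__ == "__main__":
    for m in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"margin {m:+.1f}: tangent {tangent_loss(m):6.3f}, exponential {exponential_loss(m):8.3f}")
```

    The bounded negative-margin branch is what limits the influence of mislabeled or outlier training points on the example weights during boosting.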

    The design of Bayes consistent loss functions for classification

    The combination of loss functions that are both Bayes-consistent and margin-enforcing has led to powerful classification algorithms, such as AdaBoost, which uses the exponential loss, and logistic regression and LogitBoost, which use the logistic loss. The use of Bayes-consistent, margin-enforcing losses together with efficient optimization techniques has also led to other successful classification algorithms, such as SVM classifiers, which use the hinge loss. The success of boosting and SVM classifiers is not surprising when viewed from the standpoint of Bayes consistency: such algorithms are all based on Bayes-consistent loss functions and so are guaranteed to converge to the Bayes optimal decision rule as the number of training samples increases. Despite the importance and success of Bayes-consistent loss functions, the number of such known loss functions has remained small in the literature. This is in part because a generative method for deriving such loss functions did not exist. The lack of a generative method not only prevents one from effectively designing loss functions with certain shapes, but also prevents a full analysis and taxonomy of the possible shapes and properties that such loss functions can have.

    In this thesis we solve these problems by providing a generative method for deriving Bayes-consistent loss functions. We also fully analyze such loss functions and explore the design of loss functions with certain shapes and properties. This is achieved by studying and relating the two fields of risk minimization in machine learning and probability elicitation in statistics. Specifically, the class of Bayes-consistent loss functions is partitioned into different varieties based on their convexity properties. The convexity properties of the loss and of the associated risk are also studied in detail, which, for the first time, enables the derivation of non-convex Bayes-consistent loss functions. We also develop a fully constructive method for the derivation of novel canonical loss functions, which follows from a simple connection between the associated minimum conditional risk and the optimal link function. The added insight allows us to derive variable-margin losses with explicit margin control. We then establish a common boosting framework, canonical gradientBoost, for building boosting classifiers from all canonical losses.

    Next, we extend the probability elicitation view of loss function design to the problem of designing robust loss functions for classification. The robust Savage loss and corresponding SavageBoost algorithm are derived and shown to outperform other boosting algorithms on a set of experiments designed to test robustness to outliers in the training data. We also argue that a robust loss should penalize both large positive and large negative margins. The Tangent loss and the associated TangentBoost classifier are derived with the desired robustness properties. We also develop a general framework for the derivation of Bayes-consistent cost-sensitive loss functions, which is used to derive a novel cost-sensitive hinge loss and an associated cost-sensitive SVM learning algorithm. Unlike previous SVM algorithms, the one now proposed is shown to enforce cost sensitivity for both separable and non-separable training data, independent of the choice of slack penalty.

    Finally, we present a novel framework for the design of cost-sensitive boosting algorithms. The proposed framework is used to derive cost-sensitive extensions of AdaBoost, RealBoost and LogitBoost. Experimental evidence, over different machine learning and computer vision problems, is presented in support of the new algorithms.
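    As an illustration of the generative method described above, the sketch below implements one standard way the risk-minimization/probability-elicitation connection is written: given a concave minimum conditional risk C*(eta) and an invertible link f*(eta), a Bayes-consistent loss is recovered as phi(v) = C*(eta) + (1 - eta) * dC*(eta)/deta with eta = [f*]^{-1}(v). The function names and the AdaBoost example are illustrative; the exact conventions in the thesis may differ.

```python
import numpy as np

def loss_from_risk(v, C_star, dC_star, link_inv):
    """Sketch of the generative recipe: given a concave minimum conditional risk
    C*(eta) and an invertible link f*(eta), recover the loss as
        phi(v) = C*(eta) + (1 - eta) * dC*(eta)/deta,  with eta = [f*]^{-1}(v)."""
    eta = link_inv(v)
    return C_star(eta) + (1.0 - eta) * dC_star(eta)

# Example: the machinery behind AdaBoost's exponential loss,
# C*(eta) = 2*sqrt(eta*(1-eta)) and f*(eta) = 0.5*log(eta/(1-eta)).
def C_star(eta):
    return 2.0 * np.sqrt(eta * (1.0 - eta))

def dC_star(eta):
    return (1.0 - 2.0 * eta) / np.sqrt(eta * (1.0 - eta))

def link_inv(v):
    # inverse of f*(eta) = 0.5*log(eta / (1 - eta))
    return 1.0 / (1.0 + np.exp(-2.0 * v))

v = np.linspace(-2.0, 2.0, 9)
print(np.allclose(loss_from_risk(v, C_star, dC_star, link_inv), np.exp(-v)))  # True
```

    Plugging in a different concave risk and link (for example one whose loss is bounded) is how non-convex, robust losses such as the Savage and Tangent losses arise from the same recipe.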

    High detection-rate cascades for real-time object detection

    A new strategy is proposed for the design of cascaded object detectors of high detection rate. The problem of jointly minimizing the false-positive rate and classification complexity of a cascade, given a constraint on its detection rate, is considered. It is shown that this reduces to the problem of minimizing the false-positive rate for a given detection rate and is, therefore, an instance of the classic problem of cost-sensitive learning. A cost-sensitive extension of boosting, denoted asymmetric boosting, is introduced. It maintains a high detection rate across the boosting iterations, and allows the design of cascaded detectors of high overall detection rate. Experimental evaluation shows that, when compared to previous cascade design algorithms, the cascades produced by asymmetric boosting achieve significantly higher detection rates, at the cost of a marginal increase in computation.
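    A minimal sketch of the cost-sensitive idea, not the paper's exact derivation: reweight training examples with a cost-sensitive exponential loss, so that missed detections (false negatives) are penalized more heavily than false alarms. The function name and the cost ratio c_pos/c_neg below are illustrative.

```python
import numpy as np

def cost_sensitive_weights(scores, labels, c_pos=5.0, c_neg=1.0):
    """Illustrative cost-sensitive reweighting for one boosting iteration.

    scores: current ensemble outputs f(x_i)
    labels: +1 for object (positive), -1 for background (negative)
    c_pos, c_neg: misclassification costs; c_pos > c_neg biases the next weak
    learner toward keeping the detection rate high (fewer missed positives).
    """
    costs = np.where(labels > 0, c_pos, c_neg)
    w = np.exp(-costs * labels * scores)  # cost-sensitive exponential loss
    return w / w.sum()

# Toy usage: a missed positive (score -1, label +1) ends up with far more weight
# than an equally misclassified negative (score +1, label -1).
scores = np.array([-1.0, 1.0, 2.0])
labels = np.array([+1, -1, +1])
print(cost_sensitive_weights(scores, labels))
```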